An Embedded Real-time Object Alert System for Visually Impaired: A Monocular Depth Estimation based Approach through Computer Vision

Anjom, Jareen, Chowdhury, Rashik Iram, Hasan, Tarbia, Hossain, Md. Ishan Arefin

arXiv.org Artificial Intelligence

Visually impaired people face significant challenges in their day-to-day commutes in the urban cities of Bangladesh due to the vast number of obstructions on every path. With injuries from road accidents occurring daily, it is paramount to develop a system that can warn the visually impaired of nearby objects before a collision occurs. To this end, this research proposes a novel alert system that assists the visually impaired in commuting through these busy streets by alerting them to objects at close distance. It utilizes transfer learning to train models for depth estimation and object detection, and combines both models into a single system. The models are optimized through quantization to make them lightweight and efficient enough to deploy on embedded systems. The proposed solution achieves a lightweight real-time depth estimation and object detection model with an mAP50 of 0.801.
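A minimal sketch of the fusion logic this abstract describes, assuming an off-the-shelf YOLOv8 detector and the MiDaS small model as stand-ins for the authors' transfer-learned networks (quantization is elided, and the alert threshold on normalized relative depth is a hypothetical value):

```python
import cv2
import torch
from ultralytics import YOLO

detector = YOLO("yolov8n.pt")  # stand-in for the authors' detector
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform
midas.eval()

ALERT_THRESHOLD = 0.6  # hypothetical cutoff on normalized inverse depth

def alert_close_objects(frame_bgr):
    """Return labels of detected objects whose median relative depth looks close."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        # MiDaS predicts relative inverse depth: larger values mean nearer.
        pred = midas(midas_transforms(rgb)).squeeze()
        pred = torch.nn.functional.interpolate(
            pred[None, None], size=rgb.shape[:2], mode="bilinear",
            align_corners=False).squeeze()
        pred = (pred - pred.min()) / (pred.max() - pred.min() + 1e-8)
    alerts = []
    for box in detector(frame_bgr, verbose=False)[0].boxes:
        x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
        region = pred[y1:y2, x1:x2]
        if region.numel() and region.median().item() > ALERT_THRESHOLD:
            alerts.append(detector.names[int(box.cls)])
    return alerts
```

In a deployment, `alert_close_objects` would run per frame and its output would be spoken or buzzed to the user; the per-box median depth is one simple way to make the alert robust to depth noise inside a bounding box.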


VocalEyes: Enhancing Environmental Perception for the Visually Impaired through Vision-Language Models and Distance-Aware Object Detection

Chavan, Kunal, Balaji, Keertan, Barigidad, Spoorti, Chiluveru, Samba Raju

arXiv.org Artificial Intelligence

With an increasing demand for assistive technologies that promote the independence and mobility of visually impaired people, this study proposes an innovative real-time system that gives audio descriptions of a user's surroundings to improve situational awareness. The system acquires live video input and processes it with a fine-tuned Florence-2 large model, quantized to 4-bit precision for efficient operation on low-power edge devices such as the NVIDIA Jetson Orin Nano. By converting the video stream into frames with a five-frame latency, the model provides rapid and contextually pertinent descriptions of objects, pedestrians, and barriers, together with their estimated distances. The system employs Parler TTS Mini, a lightweight and adaptable Text-to-Speech (TTS) solution, for efficient audio feedback. It accommodates 34 distinct speaker types and enables customization of speech tone, pace, and style to suit user requirements. This study examines the quantization and fine-tuning techniques used to adapt the Florence-2 model for this application, illustrating how the integration of a compact model architecture with a versatile TTS component improves real-time performance and user experience. The proposed system is assessed on its accuracy, efficiency, and usefulness, providing a viable option to help vision-impaired users navigate their surroundings securely and successfully.
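A hedged sketch of the two stages described above: loading Florence-2 in 4-bit via bitsandbytes and voicing the caption with Parler TTS Mini. The model IDs, task prompt, and speaker description are assumptions, and the authors' fine-tuning is not reproduced here.

```python
import soundfile as sf
from PIL import Image
from transformers import (AutoModelForCausalLM, AutoProcessor, AutoTokenizer,
                          BitsAndBytesConfig)
from parler_tts import ParlerTTSForConditionalGeneration

# Stage 1: Florence-2 loaded in 4-bit (assumed base model; fine-tuning omitted).
vlm = AutoModelForCausalLM.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True, device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True))
processor = AutoProcessor.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True)

image = Image.open("frame.jpg")
task = "<DETAILED_CAPTION>"  # assumed task prompt
inputs = processor(text=task, images=image, return_tensors="pt").to(vlm.device)
ids = vlm.generate(input_ids=inputs["input_ids"],
                   pixel_values=inputs["pixel_values"], max_new_tokens=128)
caption = processor.post_process_generation(
    processor.batch_decode(ids, skip_special_tokens=False)[0],
    task=task, image_size=image.size)[task]

# Stage 2: Parler TTS Mini voices the caption; the free-text style description
# is how speaker identity, tone, and pace are customized.
tts = ParlerTTSForConditionalGeneration.from_pretrained("parler-tts/parler-tts-mini-v1")
tok = AutoTokenizer.from_pretrained("parler-tts/parler-tts-mini-v1")
style = tok("A calm female voice speaking at a moderate pace.",
            return_tensors="pt").input_ids
text = tok(caption, return_tensors="pt").input_ids
audio = tts.generate(input_ids=style, prompt_input_ids=text)
sf.write("out.wav", audio.detach().cpu().numpy().squeeze(),
         tts.config.sampling_rate)
```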


Real-Time Pill Identification for the Visually Impaired Using Deep Learning

Dang, Bo, Zhao, Wenchao, Li, Yufeng, Ma, Danqing, Yu, Qixuan, Zhu, Elly Yijun

arXiv.org Artificial Intelligence

The prevalence of mobile technology offers unique opportunities for addressing healthcare challenges, especially for individuals with visual impairments. This paper explores the development and implementation of a deep learning-based mobile application designed to assist blind and visually impaired individuals in real-time pill identification. Utilizing the YOLO framework, the application aims to accurately recognize and differentiate between various pill types through real-time image processing on mobile devices. The system incorporates Text-to-Speech (TTS) to provide immediate auditory feedback, enhancing usability and independence for visually impaired users. Our study evaluates the application's effectiveness in terms of detection accuracy and user experience, highlighting its potential to improve medication management and safety among the visually impaired community. Keywords: Deep Learning; YOLO Framework; Mobile Application; Visual Impairment; Pill Identification; Healthcare
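A minimal sketch of the detect-then-speak loop the paper describes; `pill_yolo.pt` is a hypothetical fine-tuned checkpoint rather than a published artifact, and the offline pyttsx3 engine stands in for the mobile TTS stack:

```python
import pyttsx3
from ultralytics import YOLO

model = YOLO("pill_yolo.pt")  # hypothetical pill-detection weights
engine = pyttsx3.init()       # offline text-to-speech engine

def identify_and_announce(image_path):
    """Run pill detection on one image and speak the result aloud."""
    result = model(image_path, verbose=False)[0]
    if len(result.boxes) == 0:
        engine.say("No pill detected.")
    else:
        names = {model.names[int(b.cls)] for b in result.boxes}
        engine.say("Detected: " + ", ".join(sorted(names)))
    engine.runAndWait()

identify_and_announce("pill_photo.jpg")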


Viia-hand: a Reach-and-grasp Restoration System Integrating Voice interaction, Computer vision and Auditory feedback for Blind Amputees

Peng, Chunhao, Yang, Dapeng, Cheng, Ming, Dai, Jinghui, Zhao, Deyu, Jiang, Li

arXiv.org Artificial Intelligence

Visual feedback plays a crucial role in how amputees complete grasping in the field of prosthesis control. However, for blind and visually impaired (BVI) amputees, the loss of both visual and grasping abilities turns the "easy" reach-and-grasp task into a formidable challenge. In this paper, we propose a novel multi-sensory prosthesis system that helps BVI amputees with sensing, navigation, and grasp operations. It combines modules for voice interaction, environmental perception, grasp guidance, collaborative control, and auditory/tactile feedback. In particular, the voice interaction module receives user instructions and invokes the other functional modules accordingly. The environmental perception and grasp guidance module obtains environmental information through computer vision and relays it to the user through auditory feedback (voice prompts and spatial sound sources) and tactile feedback (vibration stimulation). The prosthesis collaborative control module obtains the context information of the grasp guidance process and, in conjunction with the user's control intention, controls the grasp gestures and wrist angles of the prosthesis to achieve a stable grasp of various objects. This paper details a prototype design (named viia-hand) and presents its preliminary experimental verification on healthy subjects completing specific reach-and-grasp tasks. Our results showed that, with the help of our new design, the subjects were able to achieve a precise reach and reliable grasp of the target objects in a relatively cluttered environment. Additionally, the system is extremely user-friendly, as users can quickly adapt to it with minimal training.
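The dispatch pattern at the heart of this design, where the voice-interaction module routes user instructions to the other functional modules, can be pictured with a small sketch; the module names and command phrases below are illustrative assumptions, not the viia-hand implementation:

```python
from typing import Callable, Dict

def perceive_environment() -> None:
    print("[perception] scanning scene, announcing objects via spatial audio")

def guide_grasp() -> None:
    print("[grasp guidance] vibrating towards target, adjusting wrist angle")

def give_feedback() -> None:
    print("[feedback] playing voice prompt and tactile stimulation")

# Recognized utterances mapped to functional modules (assumed phrases).
COMMANDS: Dict[str, Callable[[], None]] = {
    "what is around": perceive_environment,
    "grab the cup": guide_grasp,
    "status": give_feedback,
}

def dispatch(utterance: str) -> None:
    """Route a recognized instruction to the matching module."""
    handler = COMMANDS.get(utterance.strip().lower())
    if handler is None:
        give_feedback()  # fall back to an audible "not understood" prompt
    else:
        handler()

dispatch("what is around")
```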


MagicEye: An Intelligent Wearable Towards Independent Living of Visually Impaired

Sethuraman, Sibi C., Tadkapally, Gaurav R., Mohanty, Saraju P., Galada, Gautam, Subramanian, Anitha

arXiv.org Artificial Intelligence

Individuals with visual impairments often face a multitude of challenging obstacles in their daily lives. Vision impairment can severely limit a person's ability to work, navigate, and retain independence, resulting in educational limitations, a higher risk of accidents, and a plethora of other issues. To address these challenges, we present MagicEye, a state-of-the-art intelligent wearable device designed to assist visually impaired individuals. MagicEye employs a custom-trained CNN-based object detection model capable of recognizing a wide range of indoor and outdoor objects frequently encountered in daily life. With a total of 35 classes, the neural network employed by MagicEye has been specifically designed to achieve high levels of efficiency and precision in object detection. The device is also equipped with facial recognition and currency identification modules, providing invaluable assistance to the visually impaired. In addition, MagicEye features a GPS sensor for navigation, allowing users to move about with ease, as well as a proximity sensor for detecting nearby objects without physical contact. In summary, MagicEye is an innovative and highly advanced wearable device designed to address the many challenges faced by individuals with visual impairments, with state-of-the-art object detection and navigation capabilities tailored to their needs.
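For the contact-free proximity detection mentioned above, a minimal sketch on a Raspberry-Pi-class board, assuming an HC-SR04 ultrasonic sensor wired to GPIO 23/24 (the actual sensor and pinout in MagicEye are not specified):

```python
from gpiozero import DistanceSensor
from signal import pause

# Alert when anything comes within half a metre (assumed threshold).
sensor = DistanceSensor(echo=24, trigger=23, threshold_distance=0.5)

def announce_obstacle():
    # In the real device this would trigger audio or haptic feedback.
    print(f"Obstacle within {sensor.distance:.2f} m")

sensor.when_in_range = announce_obstacle  # fires when closer than threshold
pause()  # keep the script alive waiting for sensor events
```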


DRISHTI: Visual Navigation Assistant for Visually Impaired

Joshi, Malay, Shukla, Aditi, Srivastava, Jayesh, Rastogi, Manya

arXiv.org Artificial Intelligence

In today's society, where independent living is becoming increasingly important, blindness can be extremely constricting. Blind and visually impaired (BVI) people face challenges because they require manual assistance to obtain information about their environment. In this work, we take our first step towards developing an affordable and high-performing eye-wearable assistive device, DRISHTI, to provide visual navigation assistance for BVI people. The system comprises a camera module, an ESP32 processor, a Bluetooth module, a smartphone, and speakers. Using artificial intelligence, the system is designed to detect and understand the nature of the user's path and the obstacles ahead, and then inform BVI users about them via audio output so they can find their own way on their journey. The first step discussed in this paper establishes a proof of concept: achieving the right balance of affordability and performance by testing an initial software integration of a currency detection algorithm on a low-cost embedded arrangement. This work lays the foundation for our upcoming efforts towards assisting as many BVI people around the globe as possible in moving independently.
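A minimal sketch of the low-cost split described here: the ESP32 streams camera frames over Wi-Fi while a companion device runs recognition. The stream URL follows the stock ESP32-CAM web-server example, which DRISHTI may not use, and the recognition step is left as a placeholder:

```python
import cv2

# Hypothetical ESP32-CAM MJPEG endpoint (stock example serves /stream on port 81).
stream = cv2.VideoCapture("http://192.168.4.1:81/stream")

while True:
    ok, frame = stream.read()
    if not ok:
        break  # stream dropped or device unreachable
    # Placeholder for the currency-detection step: any lightweight classifier
    # could run on `frame` here, and its label would be spoken to the user.
    cv2.imshow("esp32", frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break

stream.release()
cv2.destroyAllWindows()
```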


AI empowering the Visually Impaired

#artificialintelligence

Some see disability as a unique individuality, and why not? This individuality needs to be understood and explored, so here are a few applications and technologies developed for those with special visual and sight needs. Microsoft has released a new free app for Apple's iPhone, called Seeing AI, and it is generating a lot of interest in a short period of time. In less than a week, "techies" were giving glowing reviews of the app and podcasters were creating tutorials, which is all great news for consumers looking for an introduction to using it. So what are the various means by which AI is empowering the visually impaired?


Computing and Assistive Technology Solutions for the Visually Impaired

Communications of the ACM

The idea of "reinventing the wheel" is very often looked down upon in research. But many devices and solutions in the assistive technology (AT) space have been available for nearly half a century and still have not reached most users in low-income countries. Two such examples are refreshable Braille displays, which make digital data accessible in Braille through touch rather than audio, and tactile diagrams, which are critical to helping visually impaired people to pursue subjects, such as science, where diagrams are crucial to understanding the concepts. While accessibility normally refers only to the modality for making information accessible, in the Indian context, it is tightly tied to affordability. No market exists in the AT space in low-income countries, though the need is very high, because the user's ability to pay is either low or non-existent.


ViT Cane: Visual Assistant for the Visually Impaired

Kumar, Bhavesh

arXiv.org Artificial Intelligence

Blind and visually challenged people face multiple issues in navigating the world independently, including finding the shortest path to a destination and detecting obstacles from a distance. To tackle this, this paper proposes ViT Cane, which leverages a Vision Transformer model to detect obstacles in real time. The system consists of a Pi Camera Module v2, a Raspberry Pi 4B with 8 GB of RAM, and four motors. Delivering tactile input through the four motors, the obstacle detection model is highly efficient in helping the visually impaired navigate unknown terrain and is designed to be easily reproduced. The paper discusses the utility of a Vision Transformer model in comparison to other CNN-based models for this specific application. Through rigorous testing, the proposed obstacle detection model achieved higher performance on the Common Objects in Context (COCO) dataset than its CNN counterpart. Comprehensive field tests were conducted to verify the effectiveness of our system for holistic indoor understanding and obstacle avoidance.
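One way to picture the tactile encoding this summary implies, with one motor per image quadrant, is the sketch below; the GPIO pins and quadrant layout are assumptions, and gpiozero's Buzzer stands in for a vibration-motor driver:

```python
from gpiozero import Buzzer  # stand-in driver for a simple vibration motor

# One motor per image quadrant (pin assignments are illustrative).
MOTORS = {
    "top_left": Buzzer(17), "top_right": Buzzer(18),
    "bottom_left": Buzzer(27), "bottom_right": Buzzer(22),
}

def vibrate_for(box_xyxy, frame_w, frame_h):
    """Pulse the motor matching the quadrant containing the obstacle's center."""
    x1, y1, x2, y2 = box_xyxy
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    key = ("top" if cy < frame_h / 2 else "bottom") + \
          ("_left" if cx < frame_w / 2 else "_right")
    MOTORS[key].beep(on_time=0.2, n=1, background=True)

# Example: obstacle detected in the upper-right of a 640x480 frame.
vibrate_for((400, 50, 600, 200), 640, 480)
```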


Technologies for the Visually Impaired

Communications of the ACM

Navigation is a huge part of the value smartphones provide for the blind and visually impaired. Thanks to recent advances in technology, the blind and visually impaired are now able to lead more independent lives than ever. The WeWALK Smart Cane is a great example of what is now possible. The WeWALK looks similar to the cane that some blind and visually impaired people have used for decades to avoid obstacles while walking, but it incorporates a few modern twists. With a standard cane, you can still run into obstacles that are not immediately underfoot, like poles, tree branches, and barriers.